Spectral coarse graining for random walk in bipartite networks

Many real-world networks display a natural bipartite structure, yet analyzing and visualizing
large bipartite networks remains a major challenge. It is therefore necessary to reduce the
complexity of large bipartite systems while preserving their functionality. We observe, however,
that existing coarse graining methods for binary networks fail on bipartite networks. In this
paper, we use spectral analysis to design a coarse graining scheme specifically for bipartite
networks that keeps their random walk properties unchanged. Numerical analysis on artificial and
real-world bipartite networks indicates that our coarse graining scheme can obtain much smaller
networks from large ones while keeping most of the relevant spectral properties. Finally, we
further validate the coarse graining method by directly comparing the mean first passage time
of the original network with that of the reduced one.

PACS numbers: 89.75.Hc, 02.50.-r



I. INTRODUCTION 

As a backbone of many complex systems, complex net- 
works have been intensively studied in the past decade. 
Examples range from social relationships among individ- 
uals, to interactions of proteins in biological systems, to 
the interdependence of function calls in large software 
projects. Network analysis has greatly helped us
understand the structure and function of real-world systems.

Bipartite networks are an important kind of complex network, composed of two types of nodes
with no links connecting nodes of the same type. For example, e-commerce systems consisting of
online users and products [1], the scientific collaboration system consisting of authors and
papers [10], and the family name inheritance system consisting of babies and names [11] are
naturally described by such networks. Recently, topological properties of bipartite networks
such as the clustering coefficient and modularity have been studied [12, 13]. However, one of
the most difficult hurdles in analyzing and visualizing bipartite networks is the size of
real-world systems. Online commercial systems, for instance, can have thousands of products and
even millions of users. Given that most algorithms used to extract the properties of a
bipartite network run in times that grow polynomially with the system size N, systems of huge
size become a challenge.

To deal with this problem, a promising approach is to consider some units of the system as
almost indistinguishable and to merge them into one, i.e., to reduce the number of nodes and
edges by mapping the network with $N$ nodes and $E$ edges onto a smaller one with $\tilde{N}$
nodes and $\tilde{E}$ edges. Based on this concept, several coarse graining schemes for binary
networks have been proposed, including k-core decomposition [15, 16], the box-covering process
[17, 18], geographical coarse graining [19], and spectral coarse graining [20, 21].

Specifically, the k-core decomposition classifies nodes into different shells which represent
their importance. This technique can be used to isolate the central core of a network, and was
also shown to be extremely effective for visualization purposes. The box-covering technique
yields a new network which preserves some of the topological features of the original one. The
geographical coarse graining uses a renormalization-group-like numerical analysis to reduce the
size of the network while preserving the degree distribution, clustering coefficient and
assortativity correlation. The spectral coarse graining methods, on the other hand, focus on
the dynamic processes taking place on networks. They merge nodes based on the eigenvectors of
different matrices, so that dynamic properties of the original network such as random walks and
synchronization are kept unchanged. Mathematically, the spectral-based methods are expressed as
preserving some eigenvalues of the stochastic matrix or the graph Laplacian. Furthermore, some
works have been dedicated to coarse graining networks for the dynamics of heterogeneous
oscillators [22] and other critical phenomena [23].

A closely related problem is community detection, which groups nodes based on link density.
Because of the importance and the complexity of finding meaningful communities, recent years
have witnessed an explosion of research on community structure in graphs, and a huge number of
methods and techniques have been designed [24]-[30] (see [31] for a review). However, there is
often no clear statement on which properties of the initial network are preserved in the
network of clusters.

Though the coarse graining methods mentioned above are effective in binary networks, they
usually have limitations when extended to directed and bipartite networks. In directed
networks, the role of nodes in the dynamics cannot be well characterized by the eigenvectors,
since imaginary values emerge when the adjacency and Laplacian matrices are asymmetric. This
problem can be solved by directly using paths to determine the similarity between nodes, which
preserves the dynamical properties (synchronization) when merging nodes. For bipartite
networks, the situation can be even more complicated. There are two types of nodes in bipartite
networks, and the dynamics on both types of nodes should be preserved. More importantly, the
coarse graining method should preserve the intrinsic bipartite feature of the network (i.e., no
link exists between nodes of the same type). However, if we regard a bipartite network as a
binary one and apply previous coarse graining methods, nodes of different types end up being
merged into one. Furthermore, using existing community detection methods to coarse grain
bipartite networks may significantly change the network function. As a result, it is still a
challenge to preserve both the network function and the bipartite property.

In this paper, taking random walks as the main characteristic [34], we introduce a new
spectral-based approach to coarse grain bipartite networks. Unlike the coarse graining methods
for binary networks, our goal is to obtain a reduced bipartite network that preserves both the
original random walk properties and the bipartite feature. In order to preserve the random walk
properties of both types of nodes, two matrices (denoted by $W_m$ and $W_n$) based on the
stochastic matrix of the bipartite network are introduced, and a new coarse graining scheme
relying on $W_m$ and $W_n$ is designed. The obtained network remains bipartite and has spectral
properties very similar to those of the original bipartite network. Moreover, we validate our
method by performing a direct test of the mean first passage time (MFPT) on artificial and
real-world bipartite networks. The new method is robust across various kinds of bipartite
networks and choices of sinks. Finally, we remark that this method can easily be extended to
preserve many other spectrally determined dynamical properties in bipartite networks.



II. SPECTRAL COARSE GRAINING METHOD 
ON BIPARTITE NETWORKS 

A. Random walks on binary networks

Random walks play a central role in dynamical processes taking place on complex networks.
Starting from a specified initial vertex, the walker jumps at each time step with equal
probability from its current location to one of its nearest neighbors. Consider a binary
network $G = (V, E)$ with $N$ nodes. The adjacency matrix $A$ has elements $A_{ij} = 1$ if
there is an edge connecting vertices $i$ and $j$, and 0 otherwise. Let $p_i(t)$ be the
probability that the walker is at vertex $i$ at time step $t$. If the walker is at vertex $j$
at time step $t - 1$, the probability of jumping to any given neighbor is $1/k_j$. Accordingly,
$p_i(t)$ on an undirected binary network is given by

$$ p_i(t) = \sum_{j} \frac{A_{ij}}{k_j}\, p_j(t-1), \tag{1} $$

where $k_j$ is the degree of vertex $j$. In matrix form, Eq. (1) can be written as
$p(t) = A D^{-1} p(t-1)$, where $p$ is the vector with elements $p_i$ and $D$ is the diagonal
matrix with the vertex degrees on its diagonal, $D = \mathrm{diag}(d_1, d_2, \ldots, d_N)$ with
$d_i = k_i$. Defining the stochastic matrix $W = D^{-1} A$, the random walk on a binary network
is characterized by $W$, whose element $W_{ij}$ is the probability that a walker at node $i$
jumps to node $j$.

In the context of transport phenomena or search on a network, the mean first passage time
(MFPT) is an important characteristic of random walks. To compute it exactly, one usually
considers some nodes as traps. The normalized Laplacian matrix of the network is defined as
$L = I - D^{-1} A$, where $I$ is the identity matrix. We use $\Gamma$ to denote the set of
traps and $|\Gamma|$ to represent the number of traps. For simplicity, we distinguish all nodes
in the network by assigning each of them a unique number: the nodes not in $\Gamma$ are labeled
consecutively from 1 to $N - |\Gamma|$, and the traps (sinks) are labeled from
$N - |\Gamma| + 1$ to $N$. By suppressing the last $|\Gamma|$ rows and columns of the
normalized Laplacian matrix $L$, we obtain the submatrix $L'$.

It is shown in [40] that the trapping time or first passage time $T_i$, defined as the expected
time for a particle starting from node $i$ to first arrive at any one of the traps, can be
expressed as

$$ T_i = \sum_{j=1}^{N-|\Gamma|} l^{-1}_{ij}, \tag{2} $$

where $l^{-1}_{ij}$ are the elements of the inverse matrix $L'^{-1}$. The MFPT
$\langle T \rangle$, defined as the average of $T_i$ over initial nodes distributed uniformly
over all nodes including the traps, is then given by

$$ \langle T \rangle = \frac{1}{N} \sum_{i=1}^{N-|\Gamma|} T_i
   = \frac{1}{N} \sum_{i=1}^{N-|\Gamma|} \sum_{j=1}^{N-|\Gamma|} l^{-1}_{ij}. \tag{3} $$

Eq. (3) can also be found in the literature in several equivalent forms.

Thus the exact solution for the MFPT of the unbiased random walk is given for any number and
location of the sinks. Eq. (2) and Eq. (3) are very useful, since they reduce the problem of
computing the MFPT to calculating the inverse of the matrix L', and they are used to check the
MFPT of different networks in the following sections, at least for networks of relatively
small size.
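As a concrete illustration of Eqs. (2) and (3), the following minimal Python sketch computes the trapping times by solving a linear system with the reduced matrix L' instead of forming its inverse explicitly. The function name `trapping_times`, the use of NumPy, and the three-node path graph are illustrative assumptions and not part of the original study. The sketch also assumes a connected network without isolated nodes.

```python
import numpy as np

def trapping_times(A, traps):
    """Trapping times T_i (Eq. 2) for non-trap nodes and the MFPT <T> (Eq. 3)."""
    A = np.asarray(A, dtype=float)
    N = A.shape[0]
    keep = np.setdiff1d(np.arange(N), traps)        # indices of non-trap nodes
    W = A / A.sum(axis=1, keepdims=True)            # stochastic matrix W = D^{-1} A
    # L' is I - W restricted to the rows and columns of the non-trap nodes.
    L_sub = np.eye(len(keep)) - W[np.ix_(keep, keep)]
    # Row sums of L'^{-1} are obtained by solving L' T = 1.
    T = np.linalg.solve(L_sub, np.ones(len(keep)))
    return keep, T, T.sum() / N                     # average over all N nodes

# Hypothetical example: path graph 0 - 1 - 2 with a single trap at node 2.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]])
keep, T, mean_T = trapping_times(A, [2])
# Here T is [4., 3.] and mean_T is 7/3.
```

Solving the linear system gives the same numbers as summing the rows of the inverse of L' and is usually the numerically safer choice.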



B. Random walks on bipartite networks

In bipartite networks, the connections between vertices are also described by an adjacency
matrix. However, since a bipartite network consists of two non-overlapping kinds of nodes and
links can only exist between nodes from distinct sets, the adjacency matrix $A$ of a bipartite
network is defined as a matrix of order $M \times N$, where $M$ and $N$ are the numbers of
vertices in the two sets. In this paper, we call the two types of nodes top and bottom nodes,
respectively. If there is a link between vertex $i$ in the top set and vertex $j$ in the bottom
set, then $A_{ij} = 1$; otherwise $A_{ij} = 0$. In bipartite networks, the random walk process
is closely related to information filtering algorithms [13]. Unlike in binary networks, the
stochastic matrix of a bipartite network is divided into two matrices. If a walker goes from
the top set to the bottom set, the stochastic matrix $U$ has order $M \times N$ with elements
$U_{ij} = A_{ij}/k_i$, which give the probability of jumping from node $i$ in the top set to
node $j$ in the bottom set; here $k_i$ is the degree of top node $i$. If the walker goes from
the bottom set to the top set, the stochastic matrix $V$ has order $N \times M$ with elements
$V_{ij} = A_{ji}/k_i$, where $k_i$ is now the degree of bottom node $i$. $U$ and $V$ contain
all the information about the random walk in the bipartite network.

Now we define two new matrices $W_m$ and $W_n$ as the matrix products $W_m = U V$ and
$W_n = V U$. Just like the stochastic matrix in binary networks, $W_m$ and $W_n$ are square
matrices. $W_m$ ($W_n$) describes random walkers going from top (bottom) nodes to top (bottom)
nodes. These two matrices have some interesting properties. In particular, the largest
eigenvalue of both matrices is equal to 1 and the corresponding eigenvector is constant.
Moreover, the largest eigenvalues of the two matrices take the same values, because the
products $U V$ and $V U$ share all their nonzero eigenvalues. As discussed in [21], the
eigenvectors corresponding to the eigenvalues close to 1 of the stochastic matrix $W$ capture
the large-scale behavior of the random walk in a binary network. The same holds for $W_m$ and
$W_n$ in a bipartite network, since they are square stochastic matrices just like $W$. As a
result, our goal is to preserve the largest eigenvalues and the corresponding eigenvectors of
$W_m$ and $W_n$. In this way, we can preserve the properties of the random walk in the
bipartite network.
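The construction of $U$, $V$, $W_m$ and $W_n$ can be sketched in a few lines of Python. The helper name `bipartite_walk_matrices` and the small $2 \times 3$ example are assumptions made for illustration only. The sketch assumes every node has at least one link, so that no division by zero occurs.

```python
import numpy as np

def bipartite_walk_matrices(A):
    """Return U (top -> bottom), V (bottom -> top), W_m = U V and W_n = V U."""
    A = np.asarray(A, dtype=float)
    U = A / A.sum(axis=1, keepdims=True)        # U_ij = A_ij / (degree of top node i)
    V = A.T / A.T.sum(axis=1, keepdims=True)    # V_ij = A_ji / (degree of bottom node i)
    return U, V, U @ V, V @ U

# Hypothetical example with M = 2 top nodes and N = 3 bottom nodes.
A = np.array([[1, 1, 0],
              [0, 1, 1]])
U, V, W_m, W_n = bipartite_walk_matrices(A)

# Eigenvalues sorted in decreasing order; they are real for these matrices,
# so only the real part is kept to discard numerical noise.
ev_m = np.sort(np.linalg.eigvals(W_m).real)[::-1]
ev_n = np.sort(np.linalg.eigvals(W_n).real)[::-1]
# ev_m is [1.0, 0.5]; ev_n is [1.0, 0.5, 0.0] up to rounding.
```

In this example both matrices have the leading eigenvalues 1 and 0.5, and the larger matrix $W_n$ only gains an extra zero eigenvalue, which illustrates why the nontrivial eigenvalues reported for $W_m$ and $W_n$ coincide.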



C. Spectral coarse graining method for bipartite 
networks 

First of all, two nodes $i$ and $j$ with exactly the same neighbors should be merged, since
they cannot be distinguished from the point of view of the random walk. In terms of the
eigenvectors, this implies $p_i^\alpha = p_j^\alpha$ for any eigenvector $p^\alpha$ of $W_m$ or
$W_n$ with eigenvalue $\lambda_\alpha \neq 0$. Here, we denote the eigenvalues of $W_m$ or
$W_n$ by $\lambda_\alpha$ and the corresponding eigenvectors by $p^\alpha$. After merging, the
new node carries the sum of the edges of nodes $i$ and $j$, and the resulting adjacency matrix
$\tilde{A}$ of the bipartite network has order $(M - 1) \times N$ or $M \times (N - 1)$, with
the row or column of the new node being the sum of rows (columns) $i$ and $j$. The properties
of the random walk in the new bipartite network are exactly the same as those in the original
network. Moreover, if $p_i^\alpha \approx p_j^\alpha$ we can also group the two nodes in order
to obtain an even smaller bipartite network. By definition, if
$|p_i^\alpha - p_j^\alpha| \propto \epsilon$ for a small $\epsilon$, we group nodes $i$ and $j$
together. As in Refs. [20, 21], this condition can be implemented by defining a parameter $I$,
the number of equally sized intervals between the minimum and the maximum components of each
eigenvector $p^\alpha$. The nodes whose eigenvector components fall in the same interval are
grouped.

We summarize the bipartite network spectral coarse 
graining (BSCG) method in the following procedures: 

1. For any given bipartite network $A$, compute the two stochastic matrices $U$ and $V$, which
give the transition probabilities from top nodes to bottom nodes and from bottom nodes to
top nodes, respectively;

2. Using $U$ and $V$, obtain the two square stochastic matrices $W_m = U V$ and $W_n = V U$;

3. Calculate the eigenvalues $\lambda_\alpha$ and the corresponding eigenvectors $p^\alpha$
of $W_m$ and $W_n$;

4. Merge nodes with similar components in $p^\alpha$ into one node. In the new adjacency
matrix $\tilde{A}$, this node carries the sum of the edges of the original nodes. The nodes
in the top set are merged based on the eigenvectors of $W_m$, and the nodes in the bottom
set are merged based on the eigenvectors of $W_n$.

The obtained adjacency matrix $\tilde{A}$ is a weighted matrix. The new stochastic matrices
$\tilde{U}$ and $\tilde{V}$ can be calculated as
$\tilde{U}_{ij} = \tilde{A}_{ij} / \sum_j \tilde{A}_{ij}$ and
$\tilde{V}_{ij} = \tilde{A}_{ji} / \sum_j \tilde{A}_{ji}$. This method can be extended to more
than one eigenvector. In this case, groups are defined as nodes with almost the same components
in all the eigenvectors corresponding to the largest nontrivial eigenvalues. In fact, choosing
several of the largest nontrivial eigenvalues better preserves the properties of the random
walk in the bipartite network.
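The four steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `leading_eigvecs`, `interval_labels` and `bscg` are hypothetical, a dense eigensolver is used for brevity, and the network is assumed to be connected so that the eigenvalue 1 is not degenerate.

```python
import numpy as np

def leading_eigvecs(W, n_vec):
    """Right eigenvectors for the n_vec largest nontrivial eigenvalues of W."""
    vals, vecs = np.linalg.eig(W)
    order = np.argsort(-vals.real)
    idx = order[1:n_vec + 1]                 # skip the trivial eigenvalue 1
    return vecs[:, idx].real

def interval_labels(P, I):
    """Label nodes by the interval (out of I equal ones) of each eigenvector component."""
    lo, hi = P.min(axis=0), P.max(axis=0)
    width = np.where(hi > lo, hi - lo, 1.0)  # guard against constant vectors
    bins = np.minimum((I * (P - lo) / width).astype(int), I - 1)
    # Nodes sharing the same interval in every eigenvector form one group.
    _, labels = np.unique(bins, axis=0, return_inverse=True)
    return labels.ravel()

def bscg(A, I, n_vec=1):
    """Coarse grain the M x N bipartite adjacency matrix A; returns the
    weighted reduced matrix and the group labels of top and bottom nodes."""
    A = np.asarray(A, dtype=float)
    U = A / A.sum(axis=1, keepdims=True)             # step 1
    V = A.T / A.T.sum(axis=1, keepdims=True)
    W_m, W_n = U @ V, V @ U                          # step 2
    top = interval_labels(leading_eigvecs(W_m, n_vec), I)   # steps 3 and 4
    bot = interval_labels(leading_eigvecs(W_n, n_vec), I)
    R = np.eye(top.max() + 1)[top]                   # M x (number of top groups)
    C = np.eye(bot.max() + 1)[bot]                   # N x (number of bottom groups)
    return R.T @ A @ C, top, bot                     # merged nodes carry summed edges
```

Because top nodes are grouped only with top nodes and bottom nodes only with bottom nodes, the reduced matrix is again the adjacency matrix of a bipartite network, and its total weight equals the number of links of the original network. For large sparse networks one would replace the dense eigensolver with a sparse one that only computes the leading eigenvectors.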



III. RESULTS 

To validate the new method, we apply it to both artificial
and real-world bipartite networks.

A. Artificial Networks



To begin with, let us consider an artificial bipartite network with 1200 vertices, which are
divided into 2 sets. The top set has 300 vertices and the bottom set has 900 vertices, and the
nodes in each set are divided into 10 groups of equal size. In other words, each group in the
top set has 30 vertices while there are 90 vertices in each group of the bottom set. The
probability that a link exists between a pair of nodes in the same group is $p_1$, while $p_2$
is the corresponding probability for pairs in different groups. In this section, $p_1 = 0.4$
and $p_2 = 0.05$. Since this kind of artificial network has significant community structure,
we call it a community network. Using the clustering method for bipartite networks of Ref.
[13], 10 communities are correctly detected in the original network.
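A network of this kind can be generated with the short sketch below. It assumes that "the same group" means that top group $g$ is paired with bottom group $g$, which is our reading of the description above; the function name `community_bipartite` and the random seed are likewise illustrative choices.

```python
import numpy as np

def community_bipartite(M=300, N=900, groups=10, p1=0.4, p2=0.05, seed=0):
    """Random bipartite network with planted groups.
    Assumes M and N are both divisible by `groups`."""
    rng = np.random.default_rng(seed)
    top_group = np.arange(M) // (M // groups)     # group index of each top node
    bot_group = np.arange(N) // (N // groups)     # group index of each bottom node
    same = top_group[:, None] == bot_group[None, :]
    prob = np.where(same, p1, p2)                 # link probability for every pair
    A = (rng.random((M, N)) < prob).astype(int)
    return A, top_group, bot_group

A, tg, bg = community_bipartite()
```

With the default parameters the matrix has 300 rows and 900 columns, and the link density is close to 0.4 inside the paired groups and close to 0.05 elsewhere.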

To coarse grain this bipartite network, we use the three largest nontrivial eigenvectors
$p^2$, $p^3$ and $p^4$. We set $I = 12$, which means that the range between the largest and
the smallest component of each eigenvector is divided into 12 intervals of equal size. Using
the BSCG method, we get a rather small network with 391 vertices. Clearly, since the BSCG
method and the community detection method focus on different properties of the bipartite
network, the resulting node groupings are different. Table I shows the three largest nontrivial
eigenvalues of $W_m$ and $W_n$ before and after the coarse graining. Though the largest
nontrivial eigenvalues of these two matrices are the same, note that the matrices $W_m$ and
$W_n$ contain different information. Specifically, $W_m$ has order $M \times M$ and describes
the probability that the walker goes from the top set to the top set, while $W_n$ describes the
walker going from the bottom set to the bottom set. Moreover, the eigenvectors of the largest
nontrivial eigenvalues of $W_m$ and $W_n$ correspond to the nodes in the top set and the bottom
set, respectively. Thus, the eigenvectors of both matrices should be considered and the
nontrivial eigenvalues of both matrices should be preserved. The eigenvalues of $W_m$ and $W_n$
of the reduced network can be calculated from the reduced adjacency matrix $\tilde{A}$ in the
same way as described in the previous section. As expected from our perturbative derivation,
the three largest eigenvalues are effectively preserved in the coarse-grained network.

Moreover, we also apply the BSCG method to ER random bipartite networks and obtain similar
results (see also Table I). In the ER random bipartite network, the probability of having a
link between two vertices of different sets is $p = 0.01$, and the top set contains 1000
vertices while the bottom set has 800 vertices. In the ER bipartite network, we also focus on
the three largest nontrivial eigenvectors and set $I = 20$. The results in Table I indicate
that our new method is robust across various kinds of artificial networks.



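TABLE I: The three largest nontrivial eigenvalues of $W_m$ and $W_n$ in the artificial networks,
including the community bipartite network and the ER random bipartite network.
$\lambda_\alpha$ and $\tilde{\lambda}_\alpha$ are the eigenvalues before and after coarse
graining, respectively.

Network | $\alpha$ | $\lambda_\alpha(W_m)$ | $\tilde{\lambda}_\alpha(W_m)$ | $\lambda_\alpha(W_n)$ | $\tilde{\lambda}_\alpha(W_n)$
Community network | 2 | 0.4405 | 0.4336 | 0.4405 | 0.4336
Community network | 3 | 0.4342 | 0.4279 | 0.4342 | 0.4279
Community network | 4 | 0.4180 | 0.4076 | 0.4180 | 0.4076
ER network | 2 | 0.3986 | 0.3933 | 0.3986 | 0.3933
ER network | 3 | 0.3908 | 0.3833 | 0.3908 | 0.3833
ER network | 4 | 0.3865 | 0.3784 | 0.3865 | 0.3784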

B. Real-world Bipartite Networks 




FIG. 1: (Color online) The top figure is a social bipartite network of terrorists with
N = 73 nodes. Node size is proportional to degree. The two colors represent the two kinds of
vertices: blue indicates the people and red indicates the organizations they belong to. The
bottom figure is the coarse-grained network (N = 23) obtained with our method based on random
walk properties. Node size is proportional to node strength in this weighted network. Colors
still correspond to the two kinds of vertices.

In this subsection, we apply our method to some real-world networks. First, we apply the
method to a social network of terrorists, whose data was collected from 430 websites. The
network was sampled from data covering the period from Oct. 1st, 1949 to May 1st, 2012,
and it is based on the relationship between terrorists

FIG. 2: (Color online) The evolution of the three largest nontrivial eigenvalues $\lambda_2$,
$\lambda_3$ and $\lambda_4$ as a function of the size of the coarse-grained network. (a)-(c)
The original network is the MovieLens network. (d)-(f) The original network is the Netflix
network. Red circles correspond to random merging of nodes, and blue squares represent the
BSCG method. The comparison with random merging shows the advantage of the BSCG method when
reducing the size of a bipartite network.



TABLE II: The three largest nontrivial eigenvalues of $W_m$ and $W_n$ in real-world bipartite
networks, including a small social network of terrorists, the MovieLens network and the Netflix
network. $\lambda_\alpha$ and $\tilde{\lambda}_\alpha$ are the eigenvalues before and after
coarse graining, respectively.

Network | $\alpha$ | $\lambda_\alpha(W_m)$ | $\tilde{\lambda}_\alpha(W_m)$ | $\lambda_\alpha(W_n)$ | $\tilde{\lambda}_\alpha(W_n)$
Terrorists | 2 | 0.8070 | 0.8059 | 0.8070 | 0.8059
Terrorists | 3 | 0.7259 | 0.7256 | 0.7259 | 0.7256
Terrorists | 4 | 0.6013 | 0.5732 | 0.6013 | 0.5732
MovieLens | 2 | 0.4180 | 0.4093 | 0.4180 | 0.4093
MovieLens | 3 | 0.2436 | 0.2305 | 0.2436 | 0.2305
MovieLens | 4 | 0.2075 | 0.1890 | 0.2075 | 0.1890
Netflix | 2 | 0.2575 | 0.2535 | 0.2575 | 0.2535
Netflix | 3 | 0.2209 | 0.2168 | 0.2209 | 0.2168
Netflix | 4 | 0.2148 | 0.1971 | 0.2148 | 0.1971


and their organizations. In this small social network, we focus on the giant component, which
is composed of 73 nodes in total, including 20 organizations and 53 people. The structure of
the original network can be seen in Fig. 1, where the blue squares represent the people and
the red circles represent the organizations. A link between a person and an organization
indicates that the person belongs to the organization. To coarse grain this network, we set
$I = 5$ and obtain a reduced network with 23 nodes, which is shown in the bottom figure of
Fig. 1. The three largest nontrivial eigenvalues before and after coarse graining are shown in
Table II. Clearly, all these eigenvalues are well preserved. With our new method, the resulting
network is also a bipartite network. In contrast, when we apply the method introduced in [21]
to this real-world network, the resulting network is a binary one in which the two original
kinds of nodes are indistinguishable.

As a further step, we apply our method to two online commercial networks: MovieLens and
Netflix. The MovieLens network was sampled from data collected over a seven-month period from
September 19th, 1997 through April 22nd, 1998. The data consists of 100,000 movie ratings from
943 users on 1,682 items. Each sampled user had rated at least 20 items. Users can vote for
movies with five rating levels from 1 (i.e., worst) to 5 (i.e., best). Here we only consider
the ratings higher than 2, so the data contains 82,520 user-object pairs. This sampled data is
freely available at [41]. The Netflix network was randomly sampled from the huge data set
provided for the Netflix Prize. The original data is freely available at [42]. It has 480,189
users, 17,770 items and 100,480,507 ratings. In this paper, we only consider a subset of this
huge data set. The subset consists of 3,000 users, 2,779 movies, and 824,802 links. Similar to
the MovieLens data, only the links with ratings no less than 3 are considered. After data
filtering, there are 197,428 links left in the Netflix network.

We first investigate how these three nontrivial eigenvalues evolve when the nodes in the
networks are merged. Fig. 2 shows the changes of these eigenvalues as a function of the network
size N + M. The red circles correspond to a random merging of the nodes into groups, and the
blue squares show the results when using the BSCG method in

FIG. 3: (Color online) Comparison of the MFPT. The walker starts at each node in the top set,
and the sink i is selected as the node with the strongest weight in the bottom set. The blue
circles represent the ranked average MFPT for each group in the original network. The MFPT of
the corresponding nodes in the coarse-grained network is displayed with red lines. (a) The
MovieLens network using the BSCG method. (b) Nodes merged randomly in the MovieLens network.
(c) The Netflix network using the BSCG method. (d) Nodes merged randomly in the Netflix
network. Insets: comparison of the exact MFPT between the original and the reduced bipartite
network. A slope of 1 indicates that the MFPT is well preserved in the reduced network.



these eigenvalues $\lambda_2$, $\lambda_3$, and $\lambda_4$, i.e., nodes are grouped if their
components in the eigenvectors corresponding to $\lambda_2$, $\lambda_3$, and $\lambda_4$ are
sufficiently close to each other ($p_i^\alpha \approx p_j^\alpha$). The different values of the
network size N + M correspond to different choices of the number of intervals $I$ defined
between the smallest and the largest components of the eigenvectors. Generally speaking, a
small $I$ yields a small network. According to the results shown in Fig. 2, these three
eigenvalues are well preserved even though the network size is significantly reduced. In fact,
$I$ can be regarded as a parameter that determines how accurately the eigenvalues are
preserved: a larger $I$ improves the precision of the method at the cost of a larger reduced
network.

Fig. 2 also clearly shows that if nodes are merged randomly, the eigenvalues change
dramatically. Consequently, the properties of the random walk will differ from those in the
original network. If the network is coarse grained according to the BSCG method, the nature of
the random walk in the bipartite network is effectively preserved. In order to keep the
eigenvalues almost unchanged, we set $I = 12$ in the MovieLens network and get a reduced
network with size N + M = 500, which is about 20% of the size of the original network. The
three largest eigenvalues in the reduced network can be seen in Table II. In the Netflix
network, we set $I = 60$ and are left with 657 nodes, which is about 10% of the size of the
original network. Table II also shows the well-preserved eigenvalues of $W_m$ and $W_n$ for
Netflix.

A more direct test of our method is to compare the mean first passage time (MFPT) from node
$i$ to node $j$, denoted by $T_{ij}$, in the original and reduced networks. In the context of
transport phenomena or navigation on a network, the MFPT is an important characteristic of the
random walk. We label the nodes in the bipartite network from 1 to $N'$ ($N' = N + M$) and
consider the bipartite network as a binary one. In this way, all the cases of random walks in
bipartite networks are included, i.e., the random walker can start from one type of node and
finally arrive at either the same type or the other type of node. We consider the multi-sink
random walk problem, for which the MFPT can be exactly calculated by Eq. (3).

In order to compare the MFPT between the original and the reduced MovieLens network, we use
the coarse-grained network with $N' = 500$ obtained above. Specifically, we let the walker
start at each node in the top set and define the node $i$ with the strongest node weight in the
bottom set as the sink. In Fig. 3(a), blue circles represent the MFPT from each node in the top
set to the nodes belonging to group $i$ in the bottom set in the original network. The MFPT to
group $i$ in the bottom set in the reduced network is displayed with red lines. The close
overlap indicates that the MFPT is well preserved in the reduced network. The inset of
Fig. 3(a) shows the relationship between the MFPT of the original network and that of the
reduced network. The result implies almost equal MFPT in the original and the reduced network,
given the same source node and sink. The slope of the curve is 0.996 and the goodness of the
linear fit is $R^2 = 0.998$. In contrast, the random coarse graining method significantly
destroys the MFPT. As shown in Fig. 3(b), the MFPT of the original network and that of the
reduced network differ from each other. The inset of Fig. 3(b) shows that there is no
significant relationship between these two MFPTs. We further test the MFPT in the Netflix
network and its coarse-grained networks obtained from the BSCG method and the random method.
Similar results are obtained (see Fig. 3(c) and (d)).

Besides computing the exact MFPT from Eq. (3), we also use numerical simulation of the random
walk process to test the BSCG method. That is to say, we put a walker on each node in the
bipartite network and let it travel according to the stochastic matrices $U$ and $V$. Results
similar to Fig. 3 are obtained, namely the reduced network from BSCG effectively preserves the
MFPT while the randomly coarse-grained network significantly changes it. Finally, we remark
that the results in Fig. 3 are consistent across different choices of sinks. No matter whether
the walker starts and ends at nodes in the same set or in different sets, the MFPT of the
reduced network from the BSCG method overlaps well with that of the original network. Taken
together, the BSCG method is very robust in its performance.
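A simulation of this kind can be sketched as below for a single source in the top set and a single sink in the bottom set. The function name `simulate_mfpt`, the number of walks, and the seed are illustrative assumptions; the original simulation setup is not specified in more detail. The sketch also accepts a weighted matrix, so it can be applied to a reduced network as well.

```python
import numpy as np

def simulate_mfpt(A, start_top, sink_bottom, n_walks=2000, seed=0):
    """Monte Carlo estimate of the mean number of steps a walker needs to go
    from top node `start_top` to bottom node `sink_bottom`."""
    rng = np.random.default_rng(seed)
    A = np.asarray(A, dtype=float)
    M, N = A.shape
    U = A / A.sum(axis=1, keepdims=True)        # top -> bottom transition probabilities
    V = A.T / A.T.sum(axis=1, keepdims=True)    # bottom -> top transition probabilities
    total = 0
    for _ in range(n_walks):
        node, on_top, steps = start_top, True, 0
        # Walk until the walker sits on the sink in the bottom set.
        while on_top or node != sink_bottom:
            if on_top:
                node = rng.choice(N, p=U[node])
            else:
                node = rng.choice(M, p=V[node])
            on_top = not on_top
            steps += 1
        total += steps
    return total / n_walks

# Hypothetical example: one top node linked to two bottom nodes, sink at bottom node 0.
# The exact MFPT is 3, since T = 1 + (1/2)(1 + T).
estimate = simulate_mfpt(np.array([[1, 1]]), start_top=0, sink_bottom=0)
```

Comparing such estimates between the original and the reduced network, with the sink mapped to its group, provides a check that is independent of Eq. (3).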
IV. CONCLUSION

One of the most difficult hurdles in the analysis of complex networks is the huge size of
real-world systems. If a network has more than $10^5$ nodes, many algorithms are quite slow and
sometimes not even feasible. In order to address this challenge, several coarse graining
methods for complex networks have been proposed, which mainly focus on one-mode networks in
which only one type of node exists.

In this paper, we proposed a new coarse graining method for bipartite networks with respect to
random walks. After introducing two square stochastic matrices $W_m$ and $W_n$, we find that
their three largest nontrivial eigenvalues can effectively represent the properties of the
random walk. By merging nodes with similar components in the eigenvectors corresponding to
these eigenvalues, we obtain a reduced network in which the eigenvalues of the stochastic
matrices are well preserved. Moreover, a direct test based on the mean first passage time is
carried out in two real-world bipartite networks, and this property is well preserved in the
reduced bipartite network. We believe that this method can easily be extended to preserve many
other spectrally determined dynamical properties in bipartite networks. Moreover, we have shown
that for a bipartite network the coarse graining provides a highly representative approximation
of the initial network, giving rise to a way to circumvent the large size of complex networks
in their analysis and visualization.

Finally, from a computational point of view, the leading eigenvectors are fast to calculate
with existing optimized methods for sparse matrices, including the Lanczos and QL algorithms.
Therefore our method can easily be applied to large bipartite networks.